source("../source/analysis.R")
incarceration <- read.csv("https://raw.githubusercontent.com/vera-institute/incarceration-trends/master/incarceration_trends.csv")

Introduction + Summary Information

Questions: 1. What is the average AAPI jail population across all counties and all years? 2. In which year is the AAPI jail population highest? 3. In which year(s) is the AAPI jail population lowest? 4. How much has the AAPI jail population changed over the last 10 years of data (2008 to 2018)? 5. What is the standard deviation of the AAPI jail population?

The variable I chose is the jail population of Asian American / Pacific Islander (AAPI) people. I take the unit to be 0.01, so each recorded value is 100 times smaller than the true number of people in jail. I asked for the mean because it tells us the average number of people in jail across all these years. The mean is about 1.98, which would be about 198 people in the real world. That is not a huge number; it means close to 200 AAPI people may be in jail each year.

I also asked for the years with the maximum and minimum population. The maximum occurs in only one year. The minimum, which is 0, occurs in many years, and those years are repeated in the data, so I had to keep only the unique years.

I also looked at the change in population over the most recent 10 years. Much of the data comes from much earlier years, and I do not think those years represent how people think and act today, so I wanted to see how the population changed over these 10 years. The standard deviation shows how much the data varies, that is, whether it is stable or not. The SD is about 14.44, which is large relative to the mean of 1.98, so I do not think the data is stable.

AAPI people are not the only group I looked at. I found the same pattern for the Black population and the total population: the change between 2008 and 2018 is always positive, which means more people were in jail in 2008 than in 2018. I think this is a big change for all races, since it suggests people are less likely to get into trouble. Still, the 10-year changes and the mean values show that different racial groups may have different results. AAPI people appear to have a smaller jail population.

# the average value of aapi population in jail
mean_aapi_jail_pop
## [1] 1.981404
# when is the max of the value of aapi population in jail
time_max_aapi_jail_pop
## [1] 1999
# when is the min of the value of aapi population in jail
time_min_aapi_jail_pop
##  [1] 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002
## [16] 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
## [31] 2018 1985 1986 1987 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980
## [46] 1981 1982 1983 1984
# the changes of the value of aapi population in jail from 2008-2018
change_aapi_jail_pop
## [1] 109.36
# the std. dev of the aapi population in jail
SD_aapi_jail_pop 
## [1] 14.44119
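
The values above come from ../source/analysis.R, which is not shown here. As a rough sketch of how summaries like these could be computed, the example below assumes the data has `year` and `aapi_jail_pop` columns and uses a small made-up data frame (`toy`) in place of the real data. The `_sketch` names and the `total_in` helper are hypothetical, and the sign convention for the change (2008 minus 2018) is an assumption.

```r
# Toy stand-in for the incarceration data; the values are made up for illustration.
toy <- data.frame(
  year = c(2008, 2008, 2013, 2018, 2018),
  aapi_jail_pop = c(4, 0, NA, 1, 0)
)

# Mean and standard deviation across all county-year rows, ignoring missing values.
mean_sketch <- mean(toy$aapi_jail_pop, na.rm = TRUE)
sd_sketch <- sd(toy$aapi_jail_pop, na.rm = TRUE)

# Year(s) in which the largest and smallest values occur.
# which() drops NA comparisons; unique() collapses repeated years.
max_value <- max(toy$aapi_jail_pop, na.rm = TRUE)
min_value <- min(toy$aapi_jail_pop, na.rm = TRUE)
time_max_sketch <- unique(toy$year[which(toy$aapi_jail_pop == max_value)])
time_min_sketch <- unique(toy$year[which(toy$aapi_jail_pop == min_value)])

# Ten-year change, taken here as the 2008 total minus the 2018 total
# (an assumption about the sign convention, not the confirmed method).
total_in <- function(y) sum(toy$aapi_jail_pop[toy$year == y], na.rm = TRUE)
change_sketch <- total_in(2008) - total_in(2018)
```

Without `unique()`, the minimum would return one year per matching county row, which is why the repeated years had to be collapsed.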

Variable comparison chart

For this plot, I chose the relation between the Black jail population and the white jail population, with the year shown by color: the lighter the color, the more recent the year. I chose these two variables because I already had the relation between time and the white jail population from the earlier question, so I think comparing the Black population against the white population makes the trend easier to see. I used a scatter plot because it shows how the relation is distributed. Most of the points are in the upper-right corner, which means both populations were high in most years between 2008 and 2018. Since the points lie along the diagonal, I think the white and Black jail populations were similar between 2008 and 2018; otherwise there would be points where the white population is high and the Black population is low, and so on. In the lower corner (low white population), the points are from 2011, 2009, and 2018. Compared with the earlier graph, 2018 and 2011 match the trend shown there, which means the jail population in King County follows a trend similar to the total population across all counties.

ggplotly(scatter)
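
The `scatter` object is built in analysis.R, which is not shown. A minimal sketch of a plot like the one described might look as follows, assuming the ggplot2 and plotly packages are installed and columns named `white_jail_pop`, `black_jail_pop`, and `year`. The data frame `toy_county` holds made-up values, and `scatter_sketch` is a hypothetical name.

```r
library(ggplot2)
library(plotly)

# Made-up values standing in for one county's yearly jail populations.
toy_county <- data.frame(
  year = 2008:2012,
  white_jail_pop = c(1200, 1100, 1150, 900, 1180),
  black_jail_pop = c(800, 760, 790, 610, 805)
)

# One point per year; color encodes the year. ggplot2's default continuous
# color scale draws larger values (later years) in a lighter blue.
scatter_sketch <- ggplot(toy_county,
                         aes(x = white_jail_pop, y = black_jail_pop, color = year)) +
  geom_point() +
  labs(x = "White jail population", y = "Black jail population", color = "Year")

# Convert the static plot into an interactive one.
ggplotly(scatter_sketch)
```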

Map

For this plot, I show a map of the US with the Black jail population in 2018. I wanted to see how the population was distributed across locations in 2018. I used the white and Black populations in the earlier question and the white population in the first graph, so I decided to use the Black jail population this time. The color of the map shows the size of the jail population in each area: the lighter the color, the more people are in jail. For the grey areas, the data set has no data. In the southwest, the color is lighter. In the middle and the east, the color is darker, which shows that fewer people are in jail there. I think this is related to each state's total population, since a larger population makes it more likely that more people are in jail.

map
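
The `map` object is also built in analysis.R, which is not shown. One way a map like this could be drawn is sketched below, assuming a state-level map, the ggplot2 and maps packages installed, and a `black_jail_pop` column already summed per state for 2018. The values in `toy_states` are made up, and `map_sketch` is a hypothetical name.

```r
library(ggplot2)  # map_data() also needs the 'maps' package installed

# Made-up 2018 state totals; real values would come from summing county rows.
toy_states <- data.frame(
  state = c("CA", "TX", "WA"),
  black_jail_pop = c(15000, 12000, 1500)
)

# map_data("state") names states in lower case, so translate the abbreviations.
toy_states$region <- tolower(state.name[match(toy_states$state, state.abb)])

# Left join so states without data keep their outline and are drawn in grey.
shapes <- merge(map_data("state"), toy_states, by = "region", all.x = TRUE)
# merge() reorders rows, so restore the polygon drawing order.
shapes <- shapes[order(shapes$group, shapes$order), ]

# Default fill scale: lighter blue for larger values, grey for missing values.
map_sketch <- ggplot(shapes,
                     aes(x = long, y = lat, group = group, fill = black_jail_pop)) +
  geom_polygon(color = "white") +
  coord_quickmap() +
  labs(fill = "Black jail population (2018)")

map_sketch
```

Dividing `black_jail_pop` by each state's total population before plotting would be one way to check whether the lighter areas reflect more than population size.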